{
 "cells": [
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<a href=\"https://colab.research.google.com/github/run-llama/llama-hub/blob/main/llama_hub/llama_packs/multi_tenancy_rag/multi_tenancy_rag.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Multi-Tenancy RAG\n",
    "\n",
    "This notebook shows how to implement Multi-Tenancy RAG with MultiTenancyRAGPack."
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Setup"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "import os\n",
    "\n",
    "os.environ[\"OPENAI_API_KEY\"] = \"YOUR OPENAI API KEY\""
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Download data"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "--2024-01-15 17:38:30--  https://arxiv.org/pdf/2312.04511.pdf\n",
      "Resolving arxiv.org (arxiv.org)... 151.101.67.42, 151.101.195.42, 151.101.131.42, ...\n",
      "Connecting to arxiv.org (arxiv.org)|151.101.67.42|:443... connected.\n",
      "HTTP request sent, awaiting response... 200 OK\n",
      "Length: 755837 (738K) [application/pdf]\n",
      "Saving to: 'llm_compiler.pdf'\n",
      "\n",
      "llm_compiler.pdf    100%[===================>] 738.12K  --.-KB/s    in 0.07s   \n",
      "\n",
      "2024-01-15 17:38:30 (10.1 MB/s) - 'llm_compiler.pdf' saved [755837/755837]\n",
      "\n",
      "--2024-01-15 17:38:31--  https://arxiv.org/pdf/2312.06648.pdf\n",
      "Resolving arxiv.org (arxiv.org)... 151.101.67.42, 151.101.195.42, 151.101.131.42, ...\n",
      "Connecting to arxiv.org (arxiv.org)|151.101.67.42|:443... connected.\n",
      "HTTP request sent, awaiting response... 200 OK\n",
      "Length: 1103758 (1.1M) [application/pdf]\n",
      "Saving to: 'dense_x_retrieval.pdf'\n",
      "\n",
      "dense_x_retrieval.p 100%[===================>]   1.05M  --.-KB/s    in 0.1s    \n",
      "\n",
      "2024-01-15 17:38:31 (8.09 MB/s) - 'dense_x_retrieval.pdf' saved [1103758/1103758]\n",
      "\n"
     ]
    }
   ],
   "source": [
    "!wget --user-agent \"Mozilla\" \"https://arxiv.org/pdf/2312.04511.pdf\" -O \"llm_compiler.pdf\"\n",
    "!wget --user-agent \"Mozilla\" \"https://arxiv.org/pdf/2312.06648.pdf\" -O \"dense_x_retrieval.pdf\""
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Load Data"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [],
   "source": [
    "from llama_index import SimpleDirectoryReader\n",
    "\n",
    "reader = SimpleDirectoryReader(input_files=[\"dense_x_retrieval.pdf\"])\n",
    "dense_x_retrieval_docs = reader.load_data()\n",
    "\n",
    "reader = SimpleDirectoryReader(input_files=[\"llm_compiler.pdf\"])\n",
    "llm_compiler_docs = reader.load_data()"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Download `MultiTenancyRAGPack`"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [],
   "source": [
    "from llama_index.llama_pack import download_llama_pack"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [],
   "source": [
    "MultiTenancyRAGPack = download_llama_pack(\n",
    "    \"MultiTenancyRAGPack\", \"./multitenancy_rag_pack\"\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [],
   "source": [
    "multitenancy_rag_pack = MultiTenancyRAGPack()"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Add documents for different users\n",
    "\n",
    "Jerry -> Dense X Retrieval Paper\n",
    "\n",
    "Ravi -> LLMCompiler Paper"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [],
   "source": [
    "multitenancy_rag_pack.add(documents=dense_x_retrieval_docs, user=\"Jerry\")\n",
    "multitenancy_rag_pack.add(documents=llm_compiler_docs, user=\"Ravi\")"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Querying for different users"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "The paper mentions propositions as an alternative retrieval unit choice. Propositions are defined as atomic expressions of meanings in text that correspond to distinct pieces of meaning in the text. They are minimal and cannot be further split into separate propositions. Each proposition is contextualized and self-contained, including all the necessary context from the text to interpret its meaning. The paper demonstrates the concept of propositions using an example about the Leaning Tower of Pisa, where the passage is split into three propositions, each corresponding to a distinct factoid about the tower.\n"
     ]
    }
   ],
   "source": [
    "# Jerry has Dense X Rerieval paper and should be able to answer following question.\n",
    "response = multitenancy_rag_pack.run(\n",
    "    \"what are propositions mentioned in the paper?\", \"Jerry\"\n",
    ")\n",
    "print(response)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "LLMCompiler involves three key steps:\n",
      "1. LLM Planner: This component generates the sequence of different tasks and their dependencies based on user inputs. It identifies the execution flow and defines the function calls with their dependencies.\n",
      "2. Task Fetching Unit: This component dynamically determines which set of tasks can be executed in parallel. It dispatches the function calls after substituting variables with the actual outputs of the preceding tasks.\n",
      "3. Executor: This component executes the dispatched function calling tasks using the associated tools. It performs the necessary function calls based on the inputs provided by the Task Fetching Unit.\n"
     ]
    }
   ],
   "source": [
    "# Ravi has LLMCompiler\n",
    "response = multitenancy_rag_pack.run(\"what are steps involved in LLMCompiler?\", \"Ravi\")\n",
    "print(response)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "I'm sorry, but I couldn't find any information about the steps involved in LLMCompiler in the given context.\n"
     ]
    }
   ],
   "source": [
    "# This should not be answered as Jerry does not have information about LLMCompiler\n",
    "response = multitenancy_rag_pack.run(\"what are steps involved in LLMCompiler?\", \"Jerry\")\n",
    "print(response)"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "py38",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.8.16 (default, Mar  1 2023, 21:19:10) \n[Clang 14.0.6 ]"
  },
  "orig_nbformat": 4,
  "vscode": {
   "interpreter": {
    "hash": "64bcadabe4cd61f3d117ba0da9d14bf2f8e35582ff79e821f2e71056f2723d1e"
   }
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
